Scaling the ISLE Taxonomy: Development of Metrics for the Multi-Dimensional Characterisation of Machine Translation Quality
Authors
Abstract
The DARPA MT evaluations of the early 1990s (together with subsequent work on the MT Scale) and the International Standards for Language Engineering (ISLE) MT Evaluation framework represent two of the principal efforts in Machine Translation Evaluation (MTE) of the past decade. We describe a research program that builds on both of these efforts. This paper focuses on the selection of MT output features suggested in the ISLE framework, as well as the development of metrics for the features to be used in the study. We define each metric and describe the rationale for its development. We also discuss several of the finer points of the evaluation measures that arose from verifying the measures against sample output texts from three machine translation systems.
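The abstract does not spell out the metrics themselves, but the kind of automatically computable feature score the ISLE framework suggests can be illustrated with simple proxies. A minimal sketch, assuming two generic features (lexical variety and average sentence length) chosen for illustration only, not the metrics actually developed in the study:

```python
# Illustrative MT-output feature metrics: each function maps a
# translated text to a single score. These are generic proxies,
# not the paper's metrics.

def type_token_ratio(text: str) -> float:
    """Lexical variety: distinct words divided by total words."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def mean_sentence_length(text: str) -> float:
    """Average words per sentence, a crude fluency proxy."""
    normalized = text.replace("!", ".").replace("?", ".")
    sentences = [s for s in normalized.split(".") if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)
```

Scores like these can then be compared across systems, or against the same scores computed on naturally occurring target-language text, as in the validation work cited below.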
Similar Articles
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines, as engines are developed through frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from the Lexical Similarity set on machine tra...
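The "Lexical Similarity" family of metrics mentioned here scores an MT output by word overlap with a human reference. A minimal sketch of one standard member of that family, clipped unigram precision (the BLEU-1 component); the example strings are illustrative:

```python
from collections import Counter

def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram precision: the fraction of candidate words
    also found in the reference, counting each reference word at
    most as many times as it occurs there."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(count, ref[word]) for word, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0
```

Full BLEU extends this with higher-order n-grams and a brevity penalty; validation studies such as the one above ask how well such scores correlate with human judgement for a given language.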
Using Multidimensional Scaling for Assessment Economic Development of Regions
Addressing socio-economic development issues is strategic and most important for any country. Multidimensional statistical analysis methods, including comprehensive index assessment, have been successfully used to address this challenge, but they don't cover all aspects of development, leaving some gap in the development of multidimensional metrics. The purpose of the study is to const...
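One standard method in the multidimensional family this snippet refers to is classical (Torgerson) multidimensional scaling, which recovers low-dimensional coordinates from a matrix of pairwise distances between items (here, regions). A minimal sketch; the three-point distance matrix is an invented toy example:

```python
import numpy as np

def classical_mds(d: np.ndarray, k: int = 2) -> np.ndarray:
    """Classical MDS: embed n items in k dimensions from an
    n x n symmetric matrix of pairwise distances."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    b = -0.5 * j @ (d ** 2) @ j               # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(b)            # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]          # keep the k largest
    scale = np.sqrt(np.clip(vals[idx], 0, None))
    return vecs[:, idx] * scale               # n x k coordinates

# Toy example: three collinear items whose spacing MDS recovers exactly.
d = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
x = classical_mds(d, k=1)
```

For Euclidean input distances the recovered coordinates reproduce the original pairwise distances up to rotation and reflection.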
Scaling the ISLE Framework: Validating Tests of Machine Translation Quality for Multi-Dimensional Measurement
Work on comparing a set of linguistic test scores for MT output to a set of the same tests’ scores for naturally-occurring target language text (Jones and Rusk 2000) broke new ground in automating MT Evaluation. However, the tests used were selected on an ad hoc basis. In this paper, we report on work to extend our understanding, through refinement and validation, of suitable linguistic tests i...
Quantitative Evaluation of Machine Translation Systems: Sentence Level
This paper reports the first results of an on-going research on evaluation of Machine Translation quality. The starting point for this work was the framework of ISLE (the International Standards for Language Engineering), which provides a classification for evaluation of Machine Translation. In order to make a quantitative evaluation of translation quality, we pursue a more consistent, finegrai...
Using the Taxonomy and the Metrics: What to Study When and Why; Comment on “Metrics and Evaluation Tools for Patient Engagement in Healthcare Organization- and System-Level Decision-Making: A Systematic Review”
Dukhanin and colleagues’ taxonomy of metrics for patient engagement at the organizational and system levels has great potential for supporting more careful and useful evaluations of this ever-growing phenomenon. This commentary highlights the central importance to the taxonomy of metrics assessing the extent of meaningful participation in decision-making by patients, consumers and community mem...